Talk by Geoffrey Hinton, University of Toronto and Canadian Institute for Advanced Research. Given to the Redwood Center for Theoretical Neuroscience on March 22, 2010.
Deep networks can be learned efficiently from unlabeled data. The layers of representation are learned one at a time using a simple learning module that has only one layer of latent variables. The values of the latent variables of one module form the data for training the next module. The most commonly used modules are Restricted Boltzmann Machines or autoencoders with a sparsity penalty on the hidden activities. Although deep networks have been quite successful for tasks such as object recognition, information retrieval, and modeling motion capture data, the simple learning modules lack multiplicative interactions, which are very useful for some types of data.
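The greedy, layer-by-layer procedure described above can be sketched as follows. This is a minimal illustration rather than the speaker's code: it assumes binary Restricted Boltzmann Machines trained with one step of contrastive divergence (CD-1), full-batch updates, and NumPy. The names `train_rbm` and `train_stack` and all hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_rbm(data, n_hidden, epochs=20, lr=0.1, rng=None):
    """Train one binary RBM with CD-1 (an assumed training rule)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    n = data.shape[0]
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one reconstruction step.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        # Update from the difference of pairwise (visible, hidden) statistics.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid

def train_stack(data, layer_sizes, **kwargs):
    """Greedy layer-wise training: each module's hidden activities
    become the training data for the next module."""
    layers = []
    x = data
    for n_hidden in layer_sizes:
        W, b_hid = train_rbm(x, n_hidden, **kwargs)
        layers.append((W, b_hid))
        x = sigmoid(x @ W + b_hid)  # data for the next module
    return layers, x
```

Calling `train_stack(data, [8, 4])` on a binary data matrix would train two modules in sequence and return their weights together with the top-level representation. Each module sees only the output of the one below it, which is what makes the procedure greedy.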
The talk will show how a third-order energy function can be factorized to yield a simple learning module that retains advantageous properties of a Restricted Boltzmann Machine, such as very simple exact inference and a very simple learning rule based on pairwise statistics. The new module has a structure that is very similar to the simple cell/complex cell hierarchy found in the visual cortex. The multiplicative interactions are useful for modeling images, image transformations, and different styles of human walking.
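To make the factorization concrete, here is a hedged sketch of one common way to factor a third-order energy function. The abstract does not give the exact form, so the following is an assumption: two input vectors `x` and `y` and a binary hidden vector `h` interact through a three-way weight tensor that is written as a sum over factors of products of three matrices `Wx`, `Wy`, and `Wh` (all names hypothetical, biases omitted).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def factored_energy(x, y, h, Wx, Wy, Wh):
    """E = -sum_f (x . Wx[:, f]) * (y . Wy[:, f]) * (h . Wh[:, f])."""
    return -np.sum((x @ Wx) * (y @ Wy) * (h @ Wh))

def full_energy(x, y, h, Wx, Wy, Wh):
    """Same energy computed through the explicit three-way tensor
    W[i, j, k] = sum_f Wx[i, f] * Wy[j, f] * Wh[k, f]."""
    W = np.einsum('if,jf,kf->ijk', Wx, Wy, Wh)
    return -np.einsum('ijk,i,j,k->', W, x, y, h)

def hidden_probs(x, y, Wx, Wy, Wh):
    """Exact inference: given x and y, the binary hidden units are
    conditionally independent, so each posterior is a single sigmoid."""
    factor_inputs = (x @ Wx) * (y @ Wy)  # one product per factor
    return sigmoid(factor_inputs @ Wh.T)
```

Under this assumed form, the factored version never builds the full tensor, so its cost grows with the number of factors rather than with the product of the three layer sizes. Each factor multiplies two filter responses together, which is where the multiplicative interaction enters.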